Smart Paradigms and the Predictability and Complexity of Inflectional Morphology

Authors

  • Grégoire Détrez
  • Aarne Ranta
Abstract

Morphological lexica are often implemented on top of morphological paradigms, each corresponding to a different way of building the full inflection table of a word. Computationally precise lexica may use hundreds of paradigms, and it can be hard for a lexicographer to choose among them. To automate this task, this paper introduces the notion of a smart paradigm. It is a metaparadigm that inspects the base form and tries to infer which low-level paradigm applies. If the result is uncertain, more forms are given for discrimination. The number of forms needed on average is a measure of the predictability of an inflection system. The overall complexity of the system also has to take into account the code size of the paradigm definitions themselves. This paper evaluates the smart paradigms implemented in the open-source GF Resource Grammar Library. Predictability and complexity are estimated for four different languages: English, French, Swedish, and Finnish. The main result is that predictability does not decrease when the complexity of morphology grows, which means that smart paradigms provide an efficient tool for the manual construction and/or automatic bootstrapping of lexica.
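
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the GF Resource Grammar Library's implementation: the function names (`smart_noun`, `forms_needed`, `average_forms`), the simplified English plural rules, and the four-word toy lexicon are all assumptions made here for illustration. The smart paradigm guesses a low-level paradigm from the base form alone, and accepts an extra form when the guess would be wrong; the average number of forms required over a lexicon then serves as the predictability measure.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical low-level paradigms: each builds a plural from a singular base form.
def plural_s(base: str) -> str:
    return base + "s"

def plural_es(base: str) -> str:
    return base + "es"

def plural_ies(base: str) -> str:
    return base[:-1] + "ies"

def smart_noun(base: str, plural: Optional[str] = None) -> Dict[str, str]:
    """Toy smart paradigm: infer the plural from the base form's ending,
    unless the caller supplies the plural as an extra discriminating form."""
    if plural is None:
        if base.endswith(("s", "x", "z", "sh", "ch")):
            plural = plural_es(base)
        elif len(base) > 1 and base.endswith("y") and base[-2] not in "aeiou":
            plural = plural_ies(base)
        else:
            plural = plural_s(base)
    return {"sg": base, "pl": plural}

def forms_needed(base: str, gold_plural: str) -> int:
    """1 if the base form alone yields the correct table, otherwise 2."""
    return 1 if smart_noun(base)["pl"] == gold_plural else 2

def average_forms(lexicon: List[Tuple[str, str]]) -> float:
    """Mean number of forms a lexicographer must supply per lexicon entry."""
    return sum(forms_needed(b, p) for b, p in lexicon) / len(lexicon)

# Toy lexicon (illustrative only, not data from the paper).
toy_lexicon = [("dog", "dogs"), ("fly", "flies"), ("box", "boxes"), ("man", "men")]
print(average_forms(toy_lexicon))  # 1.25: only "man" needs a second form
```

In this sketch a value close to 1 means that the base form almost always suffices; irregular entries such as "man" push the average up because the lexicographer has to supply the plural explicitly, as in `smart_noun("man", "men")`.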

Similar resources

Learning Morphology of Romance, Germanic and Slavic Languages with the Tool Linguistica

In this paper we present preliminary work on the semi-automatic induction of inflectional paradigms from non-annotated corpora using the open-source tool Linguistica (Goldsmith 2001), which can be used without any prior knowledge of the language. The aim is to induce morphology information from corpora so as to compare languages and foresee the difficulty to develop morphosyntactic le...

A Non-parametric Model for the Discovery of Inflectional Paradigms from Plain Text Using Graphical Models over Strings

The field of statistical natural language processing has been turning toward morphologically rich languages. These languages have vocabularies that are often orders of magnitude larger than that of English, since words may be inflected in various different ways. This leads to problems with data sparseness and calls for models that can deal with this abundance of related words—models that can le...

Introduction. Beyond the Obvious: Do Second Language Learners Process Inflectional Morphology? (Language Learning, ISSN 0023-8333)

Given that this special issue is devoted to the acquisition and processing of inflectional morphology by second language (L2) learners, the question in the title may appear redundant. However, recent research on first language (L1) and L2 morphological processing has challenged basic assumptions about the status of inflectional morphology in linguistic processing that had long been taken for gr...

A Language-Independent Feature Schema for Inflectional Morphology

This paper presents a universal morphological feature schema that represents the finest distinctions in meaning that are expressed by overt, affixal inflectional morphology across languages. This schema is used to universalize data extracted from Wiktionary via a robust multidimensional table parsing algorithm and feature mapping algorithms, yielding 883,965 instantiated paradigms in 352 langua...

A High-level Morphological Description Language Exploiting Inflectional Paradigms

A high-level language for the description of inflectional morphology is presented, in which the organization of word formation rules into an inheritance hierarchy of paradigms allows for a natural encoding of the kinds of rules typically presented in grammar books. We show how the language, composed of orthographic rules, word formation rules, and paradigm inheritance, can be compiled into a...

Publication date: 2012